Tracking Amendments to Legislation and Other Political Texts with a Novel Minimum-Edit-Distance Algorithm: DocuToads

Authors

  • Henrik Hermansson
  • James P. Cross
Abstract

Political scientists often find themselves tracking amendments to political texts. As different actors weigh in, texts change as they are drafted and redrafted, reflecting political preferences and power. This study provides a novel solution to the problem of detecting amendments to political text, based upon minimum edit distances. We demonstrate the usefulness of two language-insensitive, transparent, and efficient minimum-edit-distance algorithms suited to the task. These algorithms can account for both the types (insertions, deletions, substitutions, and transpositions) and the substantive amount of amendments made between versions of texts. To illustrate the usefulness and efficiency of the approach, we replicate two existing studies from the field of legislative studies. Our results demonstrate that minimum-edit-distance methods can produce measures of text amendments superior to hand-coded efforts at a fraction of the time and resource cost.

∗ Post-Doctoral Researcher at the Centre for European Politics, Department of Political Science, University of Copenhagen ([email protected]).
† Lecturer in European Public Policy, School of Politics and International Relations, University College Dublin ([email protected]).

arXiv:1608.06459v1 [cs.CL] 23 Aug 2016
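To make the four edit types named in the abstract concrete, here is a minimal Python sketch of a word-level minimum edit distance that counts insertions, deletions, substitutions, and transpositions of adjacent words. It is not the DocuToads algorithm itself, whose details are not given on this page. It is the standard restricted Damerau-Levenshtein (optimal string alignment) recurrence, and the function name `token_edit_distance`, whitespace tokenisation, and unit costs are all assumptions made for illustration.

```python
def token_edit_distance(old_text: str, new_text: str) -> int:
    """Word-level optimal string alignment distance (illustrative sketch only)."""
    a = old_text.split()  # assumption: simple whitespace tokenisation
    b = new_text.split()
    n, m = len(a), len(b)

    # d[i][j] = minimum edits to turn the first i words of a into the first j words of b
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i  # delete all i words
    for j in range(m + 1):
        d[0][j] = j  # insert all j words

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # transposition of two adjacent words
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[n][m]
```

For example, `token_edit_distance("the council shall adopt", "the council may adopt")` counts a single substitution. Working on words rather than characters keeps the measure close to the unit in which amendments are usually drafted, though that choice is also an assumption here.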


Similar resources

A Novel Algorithm for Digital Distance Relaying

Distance relays are used to protect EHV and HV Transmission lines. Over the past decades many algorithms have emerged for digital distance relays. These are based on the calculation of the transmission line impedance from the relaying to fault points. In this paper a novel method for digital distance relaying is proposed. In the method the tracking procedure is implemented. The method uses the ...


A Wikipedia Based Semantic Graph Model for Topic Tracking in Blogsphere

There are two key issues for information diffusion in blogosphere: (1) blog posts are usually short, noisy and contain multiple themes, (2) information diffusion through blogosphere is primarily driven by the “word-of-mouth” effect, thus making topics evolve very fast. This paper presents a novel topic tracking approach to deal with these issues by modeling a topic as a semantic graph, in which...


Adaptive Approximate Record Matching

Typographical data entry errors and incomplete documents, produce imperfect records in real world databases. These errors generate distinct records which belong to the same entity. The aim of Approximate Record Matching is to find multiple records which belong to an entity. In this paper, an algorithm for Approximate Record Matching is proposed that can be adapted automatically with input error...


Recognizing Textual Entailment with Tree Edit Distance Algorithms

This paper summarizes ITC-irst participation in the PASCAL challenge on Recognizing Textual Entailment (RTE). Given a pair of texts (the text and the hypothesis), the core of the approach we present is a tree edit distance algorithm applied on the dependency trees of both the text and the hypothesis. If the distance (i.e. the cost of the editing operations) among the two trees is below a certai...


Journal:
  • CoRR

Volume: abs/1608.06459

Publication date: 2016